I’d kinda like to wrap this whole section in a thought-bubble, or quote block, or color, or something, to indicate that the entire section is “what it looks like from inside a human’s mind”. So e.g. from inside my mind, it looks like we humans learn about our values. And then outside that bubble, we can ask “are there any actual ‘values’ which we’re in fact learning about”?
Seems accurate to me. This has been an exercise in the initial step(s) of CCC, which indeed consist of “the phenomenon looks this way to me. It also looks that way to others? Cool. What are we all cottoning on to?”
Indeed, our beliefs-about-values can be integrated into the same system as all our other beliefs, allowing for e.g. ordinary factual evidence to become relevant to beliefs about values in some cases.
Super unclear to the uninitiated what this means. (And therefore threateningly confusing to our future selves.)
Maybe: “Indeed, we can plug ‘value’ variables into our epistemic models (like, for instance, our models of what brings about reward signals) and update them as a result of non-value-laden facts about the world.”
Ahhhh
Maybe: “But presumably the reward signal does not plug directly into the action-decision system.”?
Or: “But intuitively we do not value reward for its own sake.”?
language
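To make that concrete, here is a toy sketch (nothing from the draft; the candidate hypotheses and numbers are all invented) of a ‘value’ variable treated as a latent inside an ordinary epistemic model of what produces reward signals, updated by a non-value-laden fact about the world:

```python
# Toy illustration only: a "value" variable embedded in an epistemic model of
# what brings about reward, updated by ordinary factual evidence. Every name
# and number here is made up for the sketch.

# Candidate hypotheses about what my reward signal is actually tracking.
candidates = ["social_approval", "sugar", "novelty"]

# Prior belief over which candidate my values point at.
belief = {c: 1.0 / len(candidates) for c in candidates}

def reward_likelihood(candidate, situation, reward_observed):
    """P(reward spike | this candidate is what reward tracks, ordinary facts)."""
    p_spike = 0.9 if situation.get(candidate, False) else 0.1
    return p_spike if reward_observed else 1.0 - p_spike

def update(belief, situation, reward_observed):
    """Plain Bayesian update: factual evidence moves beliefs about values."""
    posterior = {c: belief[c] * reward_likelihood(c, situation, reward_observed)
                 for c in belief}
    total = sum(posterior.values())
    return {c: p / total for c, p in posterior.items()}

# A non-value-laden fact: the situation had sugar and novelty, no social
# approval, and a reward spike followed.
situation = {"social_approval": False, "sugar": True, "novelty": True}
belief = update(belief, situation, reward_observed=True)
print(belief)  # probability mass shifts away from "social_approval"
```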
an agent could aim to pursue any values regardless of what the world outside it looks like; “how the external world is” does not tell us “how the external world should be”.
Extremely delicate wording dancing around the “should be” vs “should be according to me” distinction, with embeddedness allowing facts to update “should be according to me” without crossing the is-ought gap… in principle.
Wait. I thought that was crossing the is-ought gap. As I think of it, the is-ought gap refers to the apparent type-clash and unclear evidential entanglement between facts-about-the-world and values-an-agent-assigns-to-facts-about-the-world. And also as I think of it, “should be” is always shorthand for “should be according to me”, though it possibly means some kind of aggregated thing which still grounds out in subjective shoulds.
So “how the external world is” does not tell us “how the external world should be” … except insofar as the external world has become causally/logically entangled with a particular agent’s ‘true values’. (Punting on what an agent’s “true values” are, as opposed to the much easier “motivating values” or possibly “estimated true values.” But for the purposes of this comment, it’s sufficient to assume that they are dependent on some readable property (or logical consequence of readable properties) of the agent itself.)
Needs jargon
also needs jargon
...
wiggitywiggitywact := fact about the world which requires a typical human to cross a large inferential gap.
wact := fact about the world
mact := fact about the mind
aact := fact about the agent more generally
vwact := value assigned by some agent to a fact about the world
Spitballing:
“local fact” vs “global fact” (to evoke local/global variables)
“local fact” vs “interoperable fact”
“internal fact” vs “interoperable fact”
“fact valence” for the value stuff
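If it helps, here is the spitballing above rendered as toy types (purely illustrative; none of these names are settled, and the one function is a stub):

```python
# Toy types for the spitballed jargon; illustration only, not settled terminology.
from dataclasses import dataclass

@dataclass
class WorldFact:                      # "wact": a fact about the world
    claim: str

@dataclass
class WiggityWiggityWact(WorldFact):  # world-fact behind a large inferential gap
    pass

@dataclass
class MindFact:                       # "mact": a fact about the mind
    agent: str
    claim: str

@dataclass
class AgentFact:                      # "aact": a fact about the agent more generally
    agent: str
    claim: str

@dataclass
class ValueAssignment:                # "vwact": value some agent assigns to a world-fact
    agent: str
    fact: WorldFact
    valence: float                    # "fact valence" for the value stuff

def assign_value(agent_facts: list[AgentFact], fact: WorldFact) -> ValueAssignment:
    """Type-level version of the is-ought point: a bare WorldFact alone doesn't
    yield a ValueAssignment; facts about the agent have to enter somewhere."""
    agent = agent_facts[0].agent if agent_facts else "unknown"
    return ValueAssignment(agent=agent, fact=fact, valence=0.0)  # placeholder valence
```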
It does seem like humans have some kind of physiological “reward”, in a hand-wavy reinforcement-learning-esque sense, which seems to at least partially drive the subjective valuation of things.
Hrm… If this compresses down to, “Humans are clearly compelled at least in part by what ‘feels good’.” then I think it’s fine. If not, then this is an awkward sentence and we should discuss.
an agent could aim to pursue any values regardless of what the world outside it looks like;
Without knowing what values are, it’s unclear that an agent could aim to pursue any of them. The implicit model here is that there is something like a value function in DP which gets passed into the action-decider along with the world model and that drives the agent. But I think we’re saying something more general than that.
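For concreteness, a toy version of that implicit model (assuming “DP” here means dynamic programming; the states, rewards, and transitions are invented): a value function computed against a world model, then handed to the action-decider:

```python
# Toy sketch, not the draft's model: a DP value function derived from a world
# model, passed to an "action-decider" that picks actions greedily against it.

states = ["hungry", "fed"]
actions = ["eat", "wait"]
transition = {("hungry", "eat"): "fed", ("hungry", "wait"): "hungry",
              ("fed", "eat"): "fed", ("fed", "wait"): "hungry"}
reward = {("hungry", "eat"): 1.0, ("hungry", "wait"): -0.1,
          ("fed", "eat"): 0.0, ("fed", "wait"): 0.0}
gamma = 0.9

# Value iteration over the world model.
V = {s: 0.0 for s in states}
for _ in range(100):
    V = {s: max(reward[(s, a)] + gamma * V[transition[(s, a)]] for a in actions)
         for s in states}

def decide(state, world_model, V):
    """The action-decider: takes the world model plus the value function;
    on this picture, 'values' enter only through V."""
    trans, rew = world_model
    return max(actions, key=lambda a: rew[(state, a)] + gamma * V[trans[(state, a)]])

print(decide("hungry", (transition, reward), V))  # -> "eat"
```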
Better terminology for the phenomenon of “making sense” in the above way?
Every time the wording of a sentence implies that there are, in fact, some values which someone has or estimates, I picture the adorable not-so-sneaky elephant.
“learn” in the sense that their behavior adapts to their environment.
I want a new word for this. “Learn” vs “Adapt” maybe. Learn means updating of symbolic references (maps) while Adapt means something like responding to stimuli in a systematic way.
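A toy contrast for that proposed wording (illustration only; both systems and all numbers are made up). Both change behavior with experience; the intended difference is that only the second one’s internal state is an estimate of some external quantity, i.e. a map that can be accurate or wrong:

```python
# Illustration only: "Adapt" vs "Learn" as two systems that both change with
# experience, where only one maintains anything map-like about the world.

class Adapter:
    """Adapt: respond to stimuli in a systematic way. The internal number is a
    behavioral disposition; there is no external quantity it purports to match."""
    def __init__(self):
        self.aggression = 0.5                 # disposition, not a representation

    def respond(self, feedback):
        self.aggression += 0.1 * feedback     # nudge behavior by raw feedback
        return "fight" if self.aggression > 0.5 else "flee"

class Learner:
    """Learn: update a map of the territory, then read actions off the map."""
    def __init__(self):
        self.estimated_predator_density = 0.5  # a claim about the external world

    def observe(self, predators_seen, area_scanned):
        observed = predators_seen / area_scanned
        self.estimated_predator_density += 0.2 * (observed - self.estimated_predator_density)

    def respond(self):
        return "flee" if self.estimated_predator_density > 0.3 else "forage"

# Structurally the two updates are near-identical, which is roughly why the
# Learn/Adapt split needs sharper vocabulary than this sketch provides.
adapter, learner = Adapter(), Learner()
learner.observe(predators_seen=2, area_scanned=10.0)
print(adapter.respond(feedback=-0.2), learner.respond())  # -> flee flee
```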
… there’s a whole fucking elephant swept under that rug. I can see its trunk peeking out. It’s adorable how sneaky that elephant thinks it’s being.
We have at least one jury-rigged idea! Conceptually. Kind of.
I give up.
The internal heuristics or behaviors “learned” by an adaptive system are not necessarily “about” any particular external thing, and don’t necessarily represent any particular external thing
Yeeeahhh.… But maybe it’s just awkwardly worded rather than being deeply confused. Like: “The learned algorithms which an adaptive system implements may not necessarily accept, output, or even internally use data(structures) which have any relationship at all to some external environment.” “Also what the hell is ‘reference’.”
So much screaming
more scream
Adaptive systems “learn” things, but they don’t necessarily “learn about” things; they don’t necessarily have an internal map of the external territory.
Seconded. I have extensional ideas about “symbolic representations” and how they differ from… non-representations… but I would not trust this understanding with much weight.
scream
Seconded. Comments above.